(54) Title: A METHOD OF PROCESSING AN AUDIO SIGNAL
(57) Abstract

A method of processing a single channel audio signal to provide an audio
signal having left and right channels corresponding to a sound source at a
given direction in space, includes performing a binaural synthesis
introducing a time delay between the channels corresponding to the
inter-aural time difference for a signal coming from said given direction,
and controlling the left ear signal magnitude and the right ear signal
magnitude to be at respective values. These values are determined by
choosing a position for the sound source relative to the position of the
head of a listener in use, calculating the distance from the chosen position
of the sound source to respective ears of the listener, and determining the
corresponding left ear signal magnitude and right ear signal magnitude using
the inverse square law dependence of sound intensity with distance to
provide cues for perception of the distance of said sound source in use.
A METHOD OF PROCESSING AN AUDIO SIGNAL

This invention relates to a method of processing a single channel audio
signal to provide an audio signal having left and right channels corresponding to
a sound source at a given direction in space relative to a preferred position of a
listener in use, the information in the channels including cues for perception of the
direction of said single channel audio signal from said preferred position, the
method including the steps of: a) providing a two channel signal having the same
single channel signal in the two channels; b) modifying the two channel signal by
modifying each of the channels using one of a plurality of head response transfer
functions to provide a right signal in one channel for the right ear of a listener and
a left signal in the other channel for the left ear of the listener; and c) introducing a
time delay between the channels corresponding to the inter-aural time difference
for a signal coming from said given direction, the inter-aural time difference
providing cues to perception of the direction of the sound source at a given time.

The processing of audio signals to reproduce a three dimensional sound-field
on replay to a listener having two ears has been a goal for inventors since the
invention of stereo by Alan Blumlein in the 1930s. One approach has been to use
many sound reproduction channels to surround the listener with a multiplicity of
sound sources such as loudspeakers. Another approach has been to use a dummy
head having microphones positioned in the auditory canals of artificial ears to
make sound recordings for headphone listening. An especially promising
approach to the binaural synthesis of such a sound-field has been described in
EP-B-0689756, which describes the synthesis of a sound-field using a pair of
loudspeakers and only two signal channels, the sound-field nevertheless having
directional information allowing a listener to perceive sound sources appearing to
lie anywhere on a sphere surrounding the head of a listener placed at the centre of
the sphere.

A drawback with such systems developed in the past has been that
although the recreated sound-field has directional information, it has been
difficult to recreate the perception of having a sound source which is close to the
listener, typically a source which appears to be closer than about 1.5 metres from
the head of a listener. Such sound effects would be very effective for computer
games for example, or any other application when it is desired to have sounds
appearing to emanate from a position in space close to the head of a listener, or a
sound source which is perceived to move towards or away from a listener with
time, or to have the sensation of a person whispering in the listener's ear.

According to a first aspect of the invention there is provided a method as
specified in claims 1 - 11. According to a second aspect of the invention there is
provided apparatus as specified in claim 12. According to a third aspect of the
invention there is provided an audio signal as specified in claim 13.

Embodiments of the invention will now be described, by way of example
only, with reference to the accompanying diagrammatic drawings, in which
Figure 1 shows the head of a listener and a co-ordinate system,
Figure 2 shows a plan view of the head and an arriving sound wave,
Figure 3 shows the locus of points having an equal inter-aural time delay,
Figure 4 shows an isometric view of the locus of Figure 3,
Figure 5 shows a plan view of the space surrounding a listener's head,
Figure 6 shows further plan views of a listener's head showing paths for use in
calculations of distance to the near ear,
Figure 7 shows further plan views of a listener's head showing paths for use in
calculations of distance to the far ear,
Figure 8 shows a block diagram of a prior art method,
Figure 9 shows a block diagram of a method according to the present invention,
Figure 10 shows a plot of near ear gain as a function of azimuth and distance, and
Figure 11 shows a plot of far ear gain as a function of azimuth and distance.

The present invention relates particularly to the reproduction of 3D-sound
from two-speaker stereo systems or headphones. This type of 3D-sound is
described, for example, in EP-B-0689756 which is incorporated herein by
reference.

It is well known that a mono sound source can be digitally processed via a
pair of "Head-Response Transfer Functions" (HRTFs), such that the resultant
stereo-pair signal contains 3D-sound cues. These sound cues are introduced
naturally by the head and ears when we listen to sounds in real life, and they
include the inter-aural amplitude difference (IAD), inter-aural time difference
(ITD) and spectral shaping by the outer ear. When this stereo signal pair is
introduced efficiently into the appropriate ears of the listener, by headphones say,
then he or she perceives the original sound to be at a position in space in
accordance with the spatial location of the HRTF pair which was used for the
signal-processing.

When one listens through loudspeakers instead of headphones, then the
signals are not conveyed efficiently into the ears, for there is "transaural acoustic
crosstalk" present which inhibits the 3D-sound cues. This means that the left ear
hears a little of what the right ear is hearing (after a small, additional time-delay of
around 0.2 ms), and vice versa. In order to prevent this happening, it is known to
create appropriate "crosstalk cancellation" signals from the opposite loudspeaker.
These signals are equal in magnitude and inverted (opposite in phase) with
respect to the crosstalk signals, and designed to cancel them out. There are more
advanced schemes which anticipate the secondary (and higher order) effects of the
cancellation signals themselves contributing to secondary crosstalk, and the
correction thereof, and these methods are known in the prior art.

When the HRTF processing and crosstalk cancellation are carried out
correctly, and using high quality HRTF source data, then the effects can be quite
remarkable. For example, it is possible to move the virtual image of a sound-source
around the listener in a complete horizontal circle, beginning in front,
moving around the right-hand side of the listener, behind the listener, and back
around the left-hand side to the front again. It is also possible to make the sound
source move in a vertical circle around the listener, and indeed make the sound
appear to come from any selected position in space. However, some particular
positions are more difficult to synthesise than others, some for psychoacoustic
reasons, we believe, and some for practical reasons.

For example, the effectiveness of sound sources moving directly upwards
and downwards is greater at the sides of the listener (azimuth = 90°) than directly
in front (azimuth = 0°). This is probably because there is more left-right difference
information for the brain to work with. Similarly, it is difficult to differentiate
between a sound source directly in front of the listener (azimuth = 0°) and a
source directly behind the listener (azimuth = 180°). This is because there is no
time-domain information present for the brain to operate with (ITD = 0), and the
only other information available to the brain, spectral data, is similar in both of
these positions. In practice, there is more HF energy perceived when the source is
in front of the listener, because the high frequencies from frontal sources are
reflected into the auditory canal from the rear wall of the concha, whereas from a
rearward source, they cannot diffract around the pinna sufficiently to enter the
auditory canal effectively.

In practice, it is known to make measurements from an artificial head in
order to derive a library of HRTF data, such that 3D-sound effects can be
synthesised. It is common practice to make these measurements at distances of 1
metre or thereabouts, for several reasons. Firstly, the sound source used for such
measurements is, ideally, a point source, and usually a loudspeaker is used.
However, there is a physical limit on the minimum size of loudspeaker
diaphragms. Typically, a diameter of several inches is as small as is practical
whilst retaining the power capability and low-distortion properties which are
needed. Hence, in order to have the effects of these loudspeaker signals
representative of a point source, the loudspeaker must be spaced at a distance of
around 1 metre from the artificial head. Secondly, it is usually required to create
sound effects for PC games and the like which possess apparent distances of
several metres or greater, and so, because there is little difference between HRTFs
measured at 1 metre and those measured at much greater distances, the 1 metre
measurement is used.

The effect of a sound source appearing to be in the mid-distance (1 to 5 m,
say) or far-distance (>5 m) can be created easily by the addition of a reverberation
signal to the primary signal, thus simulating the effects of reflected sound waves
from the floor and walls of the environment. A reduction of the high frequency
(HF) components of the sound source can also help create the effect of a distant
source, simulating the selective absorption of HF by air, although this is a more
subtle effect. In summary, the effects of controlling the apparent distance of a
sound source beyond several metres are known.

However, in many PC games situations, it is desirable to have a sound
effect appear to be very close to the listener. For example, in an adventure game, 
it might be required for a "guide" to whisper instructions into one of the listener's 
ears, or alternatively, in a flight-simulator, it might be required to create the effect 
that the listener is a pilot, hearing air-traffic information via headphones. In a 
combat game, it might be required to make bullets appear to fly close by the 
listener's head. These effects are not possible with HRTFs measured at 1 metre 
distance. 

It is therefore desirable to be able to create "near-field" distance effects, in
which the sound source can appear to move from the loudspeaker distance, say,
up close to the head of the listener, and even appear to "whisper" into one of the
ears of the listener. In principle, it might be possible to make a full set of HRTF
measurements at differing distances, say 1 metre, 0.9 metre, 0.8 metre and so on,
and switch between these different libraries for near-field effects. However, as
already noted above, the measurements are compromised by the loudspeaker
diaphragm dimensions which depart from point-source properties at these
distances. Also, an immense effort is required to make each set of HRTF
measurements (typically, an HRTF library might contain over 1000 HRTF pairs
which take several man weeks of effort to measure, and then a similar time is
required to process these into useable filter coefficients), and so it would be very
costly to do this. Also, it would require considerable additional memory space to
store each additional HRTF library in the PC. A further problem would be that
such an approach would result in quantised-distance effects: the sound source
could not move smoothly to the listener's head, but would appear to "jump"
when switching between the different HRTF sets.

Ideally, what is required is a means of creating near-field distance effects
using a "standard" 1 metre HRTF set.

The present invention comprises a means of creating near-field distance 
effects for 3D-sound synthesis using a "standard" 1 metre HRTF set. The method 
uses an algorithm which controls the relative left-right channel amplitude 
difference as a function of (a) required proximity, and (b) spatial position. The 
algorithm is based on the observation that when a sound source moves towards
the head from a distance of 1 metre, then the individual left and right-ear
properties of the HRTF do not change a great deal in terms of their spectral
properties. However, their amplitudes, and the amplitude difference between 
them, do change substantially, caused by a distance ratio effect. The small changes 
in spectral properties which do occur are related largely to head-shadowing 
effects, and these can be incorporated into the near-field effect algorithm in 
addition if desired. 

In the present context, the expression "near-field" is defined to mean that
volume of space around the listener's head up to a distance of about 1 - 1.5 metre
from the centre of the head. For practical reasons, it is also useful to define a
"closeness limit", and a distance of 0.2 m has been chosen for the present purpose
of illustrating the invention. These limits have both been chosen purely for
descriptive purposes, based respectively upon a typical HRTF measurement
distance (1 m) and the closest simulation distance one might wish to create, in a
game, say. However, it is also important to note that the ultimate "closeness" is
represented by the listener hearing the sound ONLY in a single ear, as would be
the case if he or she were wearing a single earphone. This, too, can be simulated,
and can be regarded as the ultimately limiting case for close to head or "near-field"
effects. This "whispering in one ear effect" can be achieved simply by setting the
far ear gain to zero, or to a sufficiently low value to be inaudible. Then, when the
processed audio signal is auditioned on headphones, or via speakers after
appropriate transaural crosstalk cancellation processing, the sounds appear to be
"in the ear".

First, consider for example the amplitude changes. When the sound source
moves towards the head from 1 metre distance, the distance ratio (left-ear to
sound source vs. right-ear to sound source) becomes greater. For example, for a
sound source at 45° azimuth in the horizontal plane, at a distance of 1 metre from
the centre of the head, the near ear is about 0.9 metre distance and the far-ear
around 1.1 metre. So the ratio is (1.1 / 0.9) = 1.22. When the sound source moves
to a distance of 0.5 metre, then the ratio becomes (0.6 / 0.4) = 1.5, and when the
distance is 20 cm, then the ratio is approximately (0.4 / 0.1) = 4. The intensity of a
sound source diminishes with distance as the energy of the propagating wave is
spread over an increasing area. The wavefront is similar to an expanding bubble,
and the energy density is related to the surface area of the propagating wavefront,
which is related by a square law to the distance travelled (the radius of the
bubble).

This gives the well known inverse square law reduction in intensity with
distance travelled for a point source. The intensity ratios of left and right channels
are related to the inverse ratio of the squares of the distances. Hence, the intensity
ratios for distances of 1 m, 0.5 m and 0.2 m are approximately 1.49, 2.25 and 16
respectively. In dB units, these ratios are 1.73 dB, 3.52 dB and 12.04 dB
respectively.
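The arithmetic above can be checked with a short sketch. The function name and the dictionary of example distances are illustrative assumptions; the ear distances themselves are the approximate figures quoted in the text for a source at 45° azimuth.

```python
import math

def intensity_ratio_db(near_m: float, far_m: float) -> tuple[float, float]:
    """Return (intensity ratio, ratio in dB) for two source-to-ear distances.

    The intensity ratio is the inverse ratio of the squared distances
    (inverse square law), and the dB figure is 10*log10 of that ratio.
    """
    ratio = (far_m / near_m) ** 2
    return ratio, 10.0 * math.log10(ratio)

# Source distance (m) -> approximate (near-ear, far-ear) distances (m) from the text
examples = {1.0: (0.9, 1.1), 0.5: (0.4, 0.6), 0.2: (0.1, 0.4)}

for source_m, (near_m, far_m) in examples.items():
    ratio, db = intensity_ratio_db(near_m, far_m)
    print(f"source at {source_m} m: intensity ratio {ratio:.2f}, {db:.2f} dB")
```

Working from the unrounded ratio 1.1 / 0.9 gives about 1.74 dB for the 1 metre case, marginally above the 1.73 dB obtained from the rounded figure 1.49; the other two cases agree with the quoted values.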

Next, consider the head-shadowing effects. When a sound source is 1 metre
from the head, at azimuth 45°, say, then the incoming sound waves only have
one-quarter of the head to travel around in order to reach the furthermost ear, lying in
the shadow of the head. However, when the sound source is much closer, say 20
cm, then the waves have an entire hemisphere to circumnavigate before they can
reach the furthermost ear. Consequently, the HF components reaching the
furthermost ear are proportionately reduced.

It is important to note, however, that the situation is more complicated than
described in the above example, because the intensity ratio differences are
position dependent. For example, if the aforementioned situation were repeated
for a frontal sound source (azimuth 0°) approaching the head, then there would be
no difference between the left and right channel intensities, because of symmetry.
In this instance, the intensity level would simply increase according to the inverse
square law.

How then might it be possible to link any particular, close, position in three 
dimensional space with an algorithm to control the L and R channel gains 
correctly and accurately? The key factor is the inter-aural time delay, for this can 
be used to index the algorithm to spatial position in a very effective and efficient 
manner. 

The invention is best described in several stages, beginning with an account 
of the inter-aural time-delay and followed by derivations of approximate near-ear 
and far-ear distances in the listener's near-field. Figure 1 shows a diagram of the 
near-field space around the listener, together with the reference planes and axes 
which will be referred to during the following descriptions, in which P-P' 
represents the front-back axis in the horizontal plane, intercepting the centre of the 
listener's head, and with Q-Q' representing the corresponding lateral axis from 
left to right. 

As has already been noted, there is a time-of-arrival difference between the
left and right ears when a sound wave is incident upon the head, unless the sound
source is in the median plane, which includes the pole positions (i.e. directly in
front, behind, above and below). This is known as the inter-aural time delay (ITD),
and can be seen depicted in diagram form in Figure 2, which shows a plan view of
a conceptual head, with left ear and right ear receiving a sound signal from a
distant source at azimuth angle θ (about +45° as shown here). When the
wavefront (W - W') arrives at the right ear, then it can be seen that there is a path
length of (a + b) still to travel before it arrives at the left ear (LE). By the symmetry
of the configuration, the b section is equal to the distance from the head centre to
wavefront W - W', and hence: b = r.sin θ. It will be clear that the arc a represents
a proportion of the circumference, subtended by θ. By inspection, then, the path
length (a + b) is given by:

$$\text{path length} = \left(\frac{\theta}{360}\right) 2\pi r + r\sin\theta \qquad (1)$$

(This path length (in cm units) can be converted into the corresponding time-delay 
value (in ms) by dividing by 34.3.) 

It can be seen that, in the extreme, when θ tends to zero, so does the
path length. Also, when θ tends to 90°, and the head diameter is 15 cm, then the
path length is about 19.3 cm, and the associated ITD is about 563 µs. In practice,
the ITDs are measured to be slightly larger than this, typically up to 702 µs. It is
likely that this is caused by the non-spherical nature of the head (including the
presence of the pinnae and nose), the complex diffractive situation and surface
effects.
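As a minimal sketch of equation (1) and the conversion to a time delay, assuming the spherical-head model and the 34.3 cm/ms divisor given in the text (the function and constant names are illustrative, not taken from the source):

```python
import math

SPEED_OF_SOUND_CM_PER_MS = 34.3  # divisor quoted in the text

def itd_path_length_cm(azimuth_deg: float, head_radius_cm: float = 7.5) -> float:
    """Extra path length (a + b) to the far ear, for azimuths from 0 to 90 degrees."""
    arc = (azimuth_deg / 360.0) * 2.0 * math.pi * head_radius_cm  # section a
    straight = head_radius_cm * math.sin(math.radians(azimuth_deg))  # section b
    return arc + straight

def itd_us(azimuth_deg: float, head_radius_cm: float = 7.5) -> float:
    """Inter-aural time delay in microseconds for the same geometry."""
    path_cm = itd_path_length_cm(azimuth_deg, head_radius_cm)
    return path_cm / SPEED_OF_SOUND_CM_PER_MS * 1000.0

for azimuth in (0, 30, 60, 90):
    print(azimuth, round(itd_path_length_cm(azimuth), 2), round(itd_us(azimuth)))
```

At 90° this model gives a path of roughly 19.3 cm, in line with the figure above; measured ITDs are reported to be somewhat larger, so the sketch describes only the idealised sphere.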

At this stage, it is important to appreciate that, although this derivation 
relates only to the front-right quadrant in the horizontal plane (angles of azimuth 
between 0° and 90°), it is valid in all four quadrants. This is because (a) the front-
right and right-rear quadrants are symmetrical about the Q-Q' axis, and (b) the 
right two quadrants are symmetrical with the left two quadrants. (Naturally, in 
this latter case, the time-delays are reversed, with the left-ear signal leading the 
right-ear signal, rather than lagging it). 

Consequently, it will be appreciated that there are two complementary 
positions in the horizontal plane associated with any particular (valid) time delay, 
for example 30° & 150°; 40° & 140°, and so on. In practice, measurements show 
that the time-delays are not truly symmetrical, and indicate, for example, that the
maximum time delay occurs not at 90° azimuth, but at around 85°. These small
asymmetries will be set aside for the moment, for clarity of description, but it will
be seen that use of the time-delay as an index for the algorithm takes into account
all of the detailed non-symmetries, thus providing a faithful means of simulating
close sound sources.

Following on from this, if one considers the head as an approximately
spherical object, one can see that the symmetry extends into the third dimension,
where the upper hemisphere is symmetrical to the lower one, mirrored around the
horizontal plane. Accordingly, it can be appreciated that, for a given (valid)
inter-aural time-delay, there exists not just a pair of points on the horizontal (h-) plane,
but a locus, approximately circular, which intersects the h-plane at the
aforementioned points. In fact, the locus can be depicted as the surface of an
imaginary cone, extending from the appropriate listener's ear, aligned with the
lateral axis Q-Q' (Figures 3 and 4).

At this stage, it is important to note that: 

(1) the inter-aural time-delay represents a very close approximation of the 
relative acoustic path length difference between a sound source and each of 
the ears; and 

(2) the inter-aural time-delay is an integral feature of every HRTF pair. 

Consequently, when any 3D-sound synthesis system is using HRTF data, 
the associated inter-aural time delay can be used as an excellent index of relative 
path length difference. Because it is based on physical measurements, it is 
therefore a true measure, incorporating the various real-life non-linearities 
described above. 

The next stage is to find out a means of determining the value of the signal
gains which must be applied to the left and right-ear channels when a "close"
virtual sound source is required. This can be done if the near- and far-ear
situations are considered in turn, and if we use the 1 metre distance as the
outermost reference datum, at which point we define the sound intensity to be 0
dB.

Figure 5 shows a plan view of the listener's head, together with the near- 
field area surrounding it. In the first instance, we are particularly interested in the 
front-right quadrant. If we can define a relationship between near-field spatial 
position in the h-plane and distance to the near-ear (right ear in this case), then 
this can be used to control the right-channel gain. The situation is trivial to 
resolve, as shown in Figure 6, if the "true" source-to-ear paths for the close frontal 
positions (such as path "A") are assumed to be similar to the direct distance 
(indicated by "B"). This simplifies the situation, as is shown on the left diagram of
Figure 6, indicating a sound source S in the front-right quadrant, at an azimuth
angle of θ with respect to the listener. Also shown is the distance, d, of the sound
source from the head centre, and the distance, p, of the sound source from the
near-ear. The angle subtended by S-head-Q' is (90° - θ). The near-ear distance can
be derived using the cosine rule, from triangle S-head_centre-near_ear:

$$p^2 = d^2 + r^2 - 2dr\cos(90^\circ - \theta) \qquad (2)$$

If we assume the head radius, r, is 7.5 cm, then p is given by:

$$p = \sqrt{d^2 + (7.5)^2 - 15d\sin\theta} \qquad (3)$$
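The cosine-rule calculation of equations (2) and (3) can be sketched as follows. The function name and default argument are assumptions made for illustration; the 7.5 cm head radius is the value used in the text.

```python
import math

def near_ear_distance_cm(d_cm: float, azimuth_deg: float, r_cm: float = 7.5) -> float:
    """Distance p from the source to the near ear, by the cosine rule.

    d_cm is the distance from the source to the head centre, and the angle
    between the source direction and the lateral axis is (90 - azimuth).
    """
    included_angle = math.radians(90.0 - azimuth_deg)
    p_squared = d_cm**2 + r_cm**2 - 2.0 * d_cm * r_cm * math.cos(included_angle)
    return math.sqrt(p_squared)

# Source 100 cm and 20 cm from the head centre, straight ahead and at the side
for d_cm in (100.0, 20.0):
    for azimuth in (0.0, 90.0):
        print(d_cm, azimuth, round(near_ear_distance_cm(d_cm, azimuth), 2))
```

At 90° azimuth the source lies on the lateral axis, so the result reduces to d - r (92.5 cm and 12.5 cm for the two distances shown).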

Figure 7 shows a plan view of the listener's head, together with the near- 
field area surrounding it. Once again, we are particularly interested in the front- 
right quadrant. However, the path between the sound source and the far-ear 
comprises two serial elements, as is shown clearly in the right hand detail of 
Figure 7. First, there is a direct path from the source, S, tangential to the head, 
labelled q, and second, there is a circumferential path around the head, C, from
the tangent point, T, to the far-ear. As before, the distance from the sound source 
to the centre of the head is d, and the head radius is r. The angle subtended by the 
tangent point and the head centre at the source is angle R. 

The tangential path, q, can be calculated simply from the triangle:

$$q = \sqrt{d^2 - r^2} \qquad (4)$$

and also the angle R:

$$R = \sin^{-1}\left(\frac{r}{d}\right) \qquad (5)$$

Considering the triangle S-T-head_centre, the angle P-head_centre-T is (90° -
θ - R), and so the angle T-head_centre-Q (the angle subtended by the arc itself)
must be (θ + R). The circumferential path can be calculated from this angle, and is:

$$C = \left(\frac{\theta + R}{360}\right) 2\pi r \qquad (6)$$

Hence, by substituting (5) into (6), and combining with (4), an expression
for the total distance (in cm) from sound source to far-ear for a 7.5 cm radius head
can be calculated:

$$\text{Far-Ear Total Path} = \sqrt{d^2 - 7.5^2} + 2\pi r\left(\frac{\theta + \sin^{-1}\left(\frac{7.5}{d}\right)}{360}\right) \qquad (7)$$
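A minimal sketch of the far-ear path of equation (7), built from the tangential section (4), the angle R (5) and the circumferential section (6). The function name and the guard against a source inside the head are illustrative assumptions.

```python
import math

def far_ear_path_cm(d_cm: float, azimuth_deg: float, r_cm: float = 7.5) -> float:
    """Total path from source to far ear: tangent section q plus arc C."""
    if d_cm <= r_cm:
        raise ValueError("source must lie outside the head radius")
    q = math.sqrt(d_cm**2 - r_cm**2)                        # equation (4)
    angle_r_deg = math.degrees(math.asin(r_cm / d_cm))      # equation (5)
    arc = (azimuth_deg + angle_r_deg) / 360.0 * 2.0 * math.pi * r_cm  # equation (6)
    return q + arc

for d_cm in (100.0, 20.0):
    for azimuth in (0.0, 90.0):
        print(d_cm, azimuth, round(far_ear_path_cm(d_cm, azimuth), 2))
```

For a frontal source the result is almost identical to the near-ear distance at the same range, as symmetry requires; the two paths diverge as the source moves towards the side.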

It is instructive to compute the near-ear gain factor as a function of azimuth 
angle at several distances from the listener's head. This has been done, and is 
depicted graphically in Figure 10. The gain is expressed in dB units with respect to 
the 1 metre distance reference, defined to be 0 dB. The gain, in dB, is calculated 
according to the inverse square law from path length, d (in cm), as: 

$$\text{gain (dB)} = 10\log_{10}\left(\frac{10^4}{d^2}\right) \qquad (8)$$

As can be seen from the graph, the 100 cm line is equal to 0 dB at azimuth
0°, as one expects, and as the sound source moves around to the 90° position, in
line with the near-ear, the level increases to +0.68 dB, because the source is
actually slightly closer. The 20 cm distance line shows a gain of 13.4 dB at azimuth
0°, because, naturally, it is closer, and, again, the level increases as the sound
source moves around to the 90° position, to 18.1 dB: a much greater increase this time.
The other distance lines show intermediate properties between these two
extremes.

Next, consider the far-ear gain factor. This is depicted graphically in
Figure 11. As can be seen from the graph, the 100 cm line is equal to 0 dB at
azimuth 0° (as one expects), but here, as the sound source moves around to the 90°
position, away from the far-ear, the level decreases to -0.99 dB. The 20 cm
distance line shows a gain of 13.38 dB at azimuth 0°, similar to the equidistant
near-ear, and, again, the level decreases as the sound source moves around to the 90°
position, to 9.58 dB: a much greater decrease than for the 100 cm data. Again, the
other distance lines show intermediate properties between these two extremes.

It has been shown that a set of HRTF gain factors suitable for creating near- 
field effects for virtual sound sources can be calculated, based on the specified 
azimuth angle and required distance. However, in practice, the positional data is 
usually specified in spherical co-ordinates, namely: an angle of azimuth, θ, and an
angle of elevation, φ (and now, according to the invention, distance, d).
Accordingly, it is required to compute and transform this data into an equivalent 
h-plane azimuth angle (and in the range 0° to 90°) in order to compute the 
appropriate L and R gain factors, using equations (3) and (7). This can require 
significant computational resource, and, bearing in mind that the CPU or 
dedicated DSP will be running at near-full capacity, is best avoided if possible. 
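The transformation to an equivalent h-plane azimuth is not spelled out in the text. One plausible form, assuming a spherical head, azimuth measured from straight ahead and elevation measured from the horizontal plane, keeps the lateral (Q-Q') component of the source direction constant, which places the equivalent position on the same cone about the lateral axis. The function name and angle conventions below are illustrative assumptions.

```python
import math

def equivalent_h_plane_azimuth_deg(azimuth_deg: float, elevation_deg: float) -> float:
    """Map a direction to the 0-90 degree horizontal-plane azimuth on the same cone.

    The lateral component of the unit direction vector is sin(azimuth)*cos(elevation);
    the equivalent h-plane azimuth is the angle whose sine equals its magnitude.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    lateral = abs(math.sin(az) * math.cos(el))
    return math.degrees(math.asin(min(1.0, lateral)))  # clamp against rounding

print(equivalent_h_plane_azimuth_deg(150.0, 0.0))  # rear position, mirrored to the front
print(equivalent_h_plane_azimuth_deg(90.0, 60.0))  # raised source at the side
```

The trigonometric calls illustrate the per-source cost that the time-delay index described next is intended to avoid.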

An alternative approach would be to create a universal "look-up" table, 
featuring L and R gain factors for all possible angles of azimuth and elevation 
(typically around 1,111 in an HRTF library), at several specified distances. Hence 
this table, for four specified distances, would require 1,111 x 4 x 2 elements (8,888), 
and therefore would require a significant amount of computer memory allocated
to it. 

The inventors have, however, realised that the time-delay carried in each 
HRTF can be used as an index for selecting the appropriate L and R gain factors. 
Every inter-aural time-delay is associated with a horizontal plane equivalent, 
which, in turn, is associated with a specific azimuth angle. This means that a much 
smaller look-up table can be used. An HRTF library of the above resolution
features horizontal plane increments of 3°, such that there are 31 HRTFs in the
range 0° to 90°. Consequently, the size of a time-delay-indexed look-up table
would be 31 x 4 x 2 elements (248 elements), which is only 2.8% of the size of the
"universal" table, above.

The final stage in the description of the invention is to tabulate measured,
horizontal-plane, HRTF time-delays in the range 0° to 90° against their azimuth
angles, together with the near-ear and far-ear gain factors derived in previous
sections. This links the time-delays to the gain factors, and represents the look-up
table for use in a practical system. This data is shown below in the form of Table 1
(near-ear data) and Table 2 (far-ear data).
Near-ear gain factor (dB) for source distance d:

Time-delay     Azimuth      d = 20      d = 40      d = 60      d = 80     d = 100
 (samples)   (degrees)        (cm)        (cm)        (cm)        (cm)        (cm)

         0           0       13.41        7.81        4.37        1.90       -0.02
         1           3       13.56        7.89        4.43        1.94        0.01
         2           6       13.72        7.98        4.48        1.99        0.04
         4           9       13.88        8.06        4.54        2.03        0.08
         5          12       14.05        8.15        4.60        2.07        0.11
         6          15       14.22        8.24        4.66        2.11        0.15
         7          18       14.39        8.32        4.71        2.16        0.18
         8          21       14.57        8.41        4.77        2.20        0.21
         9          24       14.76        8.50        4.83        2.24        0.25
        10          27       14.95        8.59        4.88        2.28        0.28
        11          30       15.14        8.68        4.94        2.32        0.31
        12          33       15.33        8.76        4.99        2.36        0.34
        13          36       15.53        8.85        5.05        2.40        0.37
        14          39       15.73        8.93        5.10        2.44        0.40
        15          42       15.93        9.01        5.15        2.48        0.43
        16          45       16.12        9.09        5.20        2.51        0.46
        18          48       16.32        9.17        5.25        2.55        0.49
        19          51       16.51        9.24        5.29        2.58        0.51
        20          54       16.71        9.32        5.33        2.61        0.53
        21          57       16.89        9.38        5.37        2.64        0.56
        23          60       17.07        9.44        5.41        2.66        0.58
        24          63       17.24        9.50        5.44        2.69        0.59
        25          66       17.39        9.55        5.48        2.71        0.61
        26          69       17.54        9.60        5.50        2.73        0.63
        27          72       17.67        9.64        5.53        2.74        0.64
        27          75       17.79        9.68        5.55        2.76        0.65
        28          78       17.88        9.71        5.57        2.77        0.66
        28          81       17.96        9.73        5.58        2.78        0.67
        29          84       18.02        9.75        5.59        2.79        0.67
        29          87       18.05        9.76        5.59        2.79        0.68
        29          90       18.06        9.76        5.60        2.79        0.68

Table 1 

Time-delay based look-up table for determining near-ear gain 
factor as function of distance between virtual sound source and 
centre of the head. 
Far-ear gain factor (dB) for source distance d:

Time-delay     Azimuth      d = 20      d = 40      d = 60      d = 80     d = 100
 (samples)   (degrees)        (cm)        (cm)        (cm)        (cm)        (cm)

         0           0       13.38        7.81        4.37        1.90       -0.02
         1           3       13.22        7.72        4.31        1.86       -0.06
         2           6       13.07        7.64        4.26        1.82       -0.09
         4           9 [illegible] [illegible]        4.20        1.77       -0.13
         5          12 [illegible]        7.48        4.15        1.73       -0.16
         6          15 [illegible]        7.40        4.09        1.69       -0.19
         7          18       12.48 [illegible]        4.04        1.65       -0.23
         8          21 [illegible] [illegible] [illegible]        1.61       -0.26
         9          24 [illegible] [illegible] [illegible]        1.57       -0.29
        10          27 [illegible] [illegible] [illegible]        1.53       -0.33
        11          30 [illegible] [illegible] [illegible]        1.49       -0.36
        12          33 [illegible] [illegible] [illegible]        1.45       -0.39
        13          36 [illegible] [illegible] [illegible]        1.41       -0.42
        14          39 [illegible] [illegible] [illegible]        1.37       -0.46
        15          42 [illegible] [illegible] [illegible]        1.33       -0.49
        16          45 [illegible] [illegible] [illegible]        1.29       -0.52
        18          48 [illegible] [illegible] [illegible] [illegible]       -0.55
        19          51 [illegible] [illegible]        3.46        1.21       -0.58
        20          54 [illegible] [illegible] [illegible]        1.17       -0.62
        21          57 [illegible] [illegible] [illegible]        1.13       -0.65
        23          60 [illegible]        6.27        3.31        1.09       -0.68
        24          63 [illegible] [illegible] [illegible]        1.05       -0.71
        25          66 [illegible] [illegible]        3.21        1.01       -0.74
        26          69       10.33        6.07        3.16        0.97       -0.77
        27          72       10.22        6.00        3.11        0.94       -0.80
        27          75       10.11        5.93        3.06        0.90       -0.84
        28          78       10.00        5.86        3.01        0.86       -0.87
        28          81        9.89        5.80        2.97        0.82       -0.90
        29          84        9.78        5.73        2.92        0.79       -0.93
        29          87        9.68        5.66        2.87        0.75       -0.96
        29          90        9.58        5.60        2.82        0.71       -0.99

Table 2 

Time-delay based look-up table for determining far-ear gain factor 
as function of distance between virtual sound source and centre 
of the head. 



Note that the time-delays in the above tables are shown in units of sample 
periods related to a 44.1 kHz sampling rate, hence each sample unit is 22.676 µs. 
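
As a check on the sample-period figure quoted above, the following minimal Python sketch converts a delay expressed in sample periods to microseconds at the stated 44.1 kHz rate. The constant and function names are illustrative only and are not taken from the specification.

```python
SAMPLE_RATE_HZ = 44_100  # sampling rate stated in the text


def delay_samples_to_microseconds(samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert a delay in sample periods to microseconds."""
    return samples * 1_000_000 / sample_rate_hz


# One sample period, rounded to three decimal places as in the text.
print(round(delay_samples_to_microseconds(1), 3))
# The largest time-delay listed in the tables (29 samples).
print(round(delay_samples_to_microseconds(29), 1))
```

On this basis the largest tabulated delay of 29 samples corresponds to roughly 0.66 ms.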

Consider, by way of example, the case when a virtual sound source is 
required to be positioned in the horizontal plane at an azimuth of 60°, and at a 
distance of 0.4 metres. Using Table 1, the near-ear gain which must be applied to 
the HRTF is shown as 9.44 dB, and the far-ear gain (from Table 2) is 6.27 dB. 

Consider, as a second example, the case when a virtual sound source is 
required to be positioned out of the horizontal plane, at an azimuth of 42° and 
elevation of -60°, at a distance of 0.2 metres. The HRTF for this particular spatial 
position has a time-delay of 7 sample periods (at 44.1 kHz). Consequently, using 
Table 1, the near-ear gain which must be applied to the HRTF is shown as 14.39 
dB, and the far-ear gain (from Table 2) is 12.48 dB. (This HRTF time-delay is the 
same as that of a horizontal-plane HRTF with an azimuth value of 18°). 
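
The two worked examples above may be reproduced by a simple table look-up keyed on time-delay and distance. The sketch below is a minimal illustration under stated assumptions, not the implementation of the invention: it transcribes only the two rows of Table 1 that the examples use, takes the two far-ear figures directly from the text, and the names `NEAR_EAR_GAIN_DB`, `FAR_EAR_GAIN_DB` and `near_far_gain_db` are hypothetical.

```python
# Distances (cm) heading the gain columns of Tables 1 and 2.
DISTANCES_CM = (20, 40, 60, 80, 100)

# Near-ear gains (dB) from Table 1, keyed by time-delay in samples at 44.1 kHz.
# Only the two rows needed for the worked examples are transcribed here.
NEAR_EAR_GAIN_DB = {
    7: (14.39, 8.32, 4.71, 2.16, 0.18),   # azimuth 18 degrees
    23: (17.07, 9.44, 5.41, 2.66, 0.58),  # azimuth 60 degrees
}

# Far-ear gains (dB) quoted in the text from Table 2,
# keyed by (time-delay in samples, distance in cm).
FAR_EAR_GAIN_DB = {
    (7, 20): 12.48,
    (23, 40): 6.27,
}


def near_far_gain_db(delay_samples, distance_cm):
    """Return (near-ear, far-ear) gain in dB for a time-delay and distance."""
    column = DISTANCES_CM.index(distance_cm)
    near = NEAR_EAR_GAIN_DB[delay_samples][column]
    far = FAR_EAR_GAIN_DB[(delay_samples, distance_cm)]
    return near, far


print(near_far_gain_db(23, 40))  # first example: azimuth 60 degrees, 0.4 m
print(near_far_gain_db(7, 20))   # second example: 7-sample delay, 0.2 m
```

Several time-delay values (27, 28 and 29 samples) appear against more than one azimuth in the tables, so a complete look-up keyed on time-delay alone would need a rule for choosing between those rows; that choice is not addressed here.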

The implementation of the invention is straightforward, and is depicted 
schematically in Figure 9. Figure 8 shows the conventional means of creating a 
virtual sound source, as follows. First, the spatial position of the virtual sound 
source is specified, and used to select an HRTF appropriate to that position. The 
HRTF comprises a left-ear function, a right-ear function and an inter-aural 
time-delay value. In a computer system for creating the virtual sound source, the 
HRTF data will generally be in the form of FIR filter coefficients suitable for 
controlling a pair of FIR filters (one for each channel), and the time-delay will be 
represented by a number. A monophonic sound source is then transmitted into the 
signal-processing scheme, as shown, thus creating both left- and right-hand channel 
outputs. (These output signals are then suitable for onward transmission to the 
listener's headphones, or crosstalk-cancellation processing for loudspeaker 
reproduction, or other means). 

The invention, shown in Figure 9, supplements this procedure, but requires 
little extra computation. This time, the signals are processed as previously, but a 
near-field distance is also specified, and, together with the time-delay data from 
the selected HRTF, is used to select the gain for respective left and right channels 
from a look-up table; this data is then used to control the gain of the signals before 
they are output to subsequent stages, as described before. 
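
One way in which the gain-control stage of Figure 9 might be realised is sketched below in Python. It assumes that the table entries are amplitude gains in decibels (so that a gain of G dB corresponds to a linear factor of 10^(G/20)), that each channel is held as a list of samples already filtered by the selected HRTF, and that the caller knows on which side of the head the source lies. The function names are hypothetical.

```python
def db_to_linear(gain_db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def apply_near_field_gains(left, right, near_db, far_db, source_on_left):
    """Scale HRTF-filtered channel samples by the looked-up gains.

    The ear on the same side as the source receives the near-ear gain,
    and the opposite ear receives the far-ear gain.
    """
    if source_on_left:
        left_db, right_db = near_db, far_db
    else:
        left_db, right_db = far_db, near_db
    left_factor = db_to_linear(left_db)
    right_factor = db_to_linear(right_db)
    scaled_left = [left_factor * sample for sample in left]
    scaled_right = [right_factor * sample for sample in right]
    return scaled_left, scaled_right


# Source on the right at azimuth 60 degrees and 0.4 m (gains from the text).
out_left, out_right = apply_near_field_gains(
    [0.1, -0.2], [0.1, -0.2], near_db=9.44, far_db=6.27, source_on_left=False
)
```

Because the gains are plain multiplications, they can equally be applied before the HRTF filtering without changing the result.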

The left channel output and the right channel output shown in Figure 9 can 
be combined directly with a normal stereo or binaural signal being fed to 
headphones, for example, simply by adding the signal in corresponding channels. 
If the outputs shown in Figure 9 are to be combined with those created for 
producing a 3D sound-field generated, for example, by binaural synthesis (such 
as, for example, using the Sensaura (Trade Mark) method described in EP-B- 
0689756), then the two output signals should be added to the corresponding 
channels of the binaural signal after transaural crosstalk compensation has been 
performed. 
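
The direct combination described above amounts to a sample-by-sample sum of corresponding channels. A minimal sketch follows, assuming both signals are held as (left, right) pairs of equal-length sample lists; the name `mix_stereo` is illustrative only.

```python
def mix_stereo(signal_a, signal_b):
    """Add two two-channel signals channel by channel.

    Each argument is a (left, right) pair of equal-length sample lists.
    """
    a_left, a_right = signal_a
    b_left, b_right = signal_b
    mixed_left = [x + y for x, y in zip(a_left, b_left)]
    mixed_right = [x + y for x, y in zip(a_right, b_right)]
    return mixed_left, mixed_right


# Near-field source output added to an existing stereo programme.
combined = mix_stereo(([0.5, 0.25], [0.125, 0.0]), ([0.25, 0.25], [0.5, 0.5]))
```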

Although in the example described above the setting of magnitude of the 
left and right signals is performed after modification using a head response 
transfer function, the magnitudes may be set before such signal processing if 
desired, so that the order of the steps in the described method is not an essential 
part of the invention. 

Although in the example described above the position of the virtual sound 
source relative to the preferred position of a listener's head in use is constant and 
does not change with time, by suitable choice of successive different positions for 
the virtual sound source it can be made to move relative to the head of the listener 
in use if desired. This apparent movement may be provided by changing the 
direction of the virtual source from the preferred position, by changing the distance 
to it, or by changing both together. 
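
By way of illustration, a source approaching the listener along a fixed azimuth of 60° can be represented as a succession of look-ups in the 60° row of Table 1. The sketch below assumes that the decibel entries are amplitude gains and uses hypothetical names; it shows only the near-ear gain and only the tabulated distances.

```python
# Near-ear gains (dB) for azimuth 60 degrees, copied from Table 1,
# keyed by distance in cm from the centre of the head.
NEAR_EAR_GAIN_DB_AZ60 = {100: 0.58, 80: 2.66, 60: 5.41, 40: 9.44, 20: 17.07}


def approach_gains(distances_cm):
    """Return the linear near-ear gain for each successive source position."""
    return [10.0 ** (NEAR_EAR_GAIN_DB_AZ60[d] / 20.0) for d in distances_cm]


# Source moving from 1 m to 0.2 m in 20 cm steps.
gains = approach_gains([100, 80, 60, 40, 20])
```

Each successive gain is larger than the last, which provides the cue that the source is drawing nearer.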

Finally, the content of the accompanying abstract is hereby incorporated 
into this description by reference. 
CLAIMS 

1. A method of processing a single channel audio signal to provide an audio 
signal having left and right channels corresponding to a sound source at a 
given direction in space relative to a preferred position of a listener in use, 
the information in the channels including cues for perception of the 
direction of said single channel audio signal from said preferred position, 
the method including the steps of: a) providing a two channel signal having 
the same single channel signal in the two channels; b) modifying the two 
channel signal by modifying each of the channels using one of a plurality of 
head response transfer functions to provide a right signal in one channel 
for the right ear of a listener and a left signal in the other channel for the left 
ear of the listener; and c) introducing a time delay between the channels 
corresponding to the inter-aural time difference for a signal coming from 
said given direction, the inter-aural time difference providing cues to 
perception of the direction of the sound source at a given time, 
characterised in that the method includes controlling the magnitude of the 
left signal and the right signal to be at respective values at said given time, 
the values being chosen to provide cues for perception of the distance of 
said sound source from said preferred position at said given time. 

2. A method of processing a single channel audio signal as claimed in claim 1 
in which the left signal magnitude and the right signal magnitude are 
chosen separately. 

3. A method as claimed in any preceding claim in which the left ear signal 
magnitude and right ear signal magnitude are determined by choosing a 
position for the sound source relative to said preferred position of the head 
of a listener in use, determining the distance from the chosen position of the 
sound source to respective ears of said listener, and determining the 
corresponding left signal magnitude and right signal magnitude using the 
inverse square law dependence of sound intensity with distance. 

4. A method as claimed in claim 3 in which the distance from the chosen 
position of the sound source, at said given time, to respective ears of said 
listener is determined from a look-up table. 

5. A method as claimed in claim 3 in which the distance from the position of 
the sound source, at said given time, to the centre of the head of said 
listener is chosen, and the distance to respective ears is determined from 
the inter-aural time delay. 

6. A method as claimed in claim 5 in which the distance to respective ears is 
determined from a look-up table. 

7. A method as claimed in any preceding claim in which the magnitude of the 
left signal or the magnitude of the right signal is sufficiently small as to be 
inaudible. 

8. A method as claimed in any preceding claim in which the left signal and 
right signal are compensated to cancel or reduce transaural crosstalk when 
supplied as left and right channels for replay by loudspeakers. 

9. A method as claimed in any preceding claim in which the resulting two 
channel audio signal is combined with a further two or more channel audio 
signal. 

10. A method as claimed in claim 9 in which the signals are combined by 
adding the content of corresponding channels to provide a combined signal 
having two channels. 

11. A computer program for implementing a method as claimed in any 
preceding claim. 

12. Apparatus for performing the method as claimed in any preceding claim. 

13. An audio signal processed by a method as claimed in any of claims 1 - 10. 